Search | BVS Regional Portal

SnapKin: a snapshot deep learning ensemble for kinase-substrate prediction from phosphoproteomics data.

Xiao, Di; Lin, Michael; Liu, Chunlei; Geddes, Thomas A; Burchfield, James G; Parker, Benjamin L; Humphrey, Sean J; Yang, Pengyi.

NAR Genom Bioinform ; 5(4): lqad099, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37954574

ABSTRACT

A major challenge in mass spectrometry-based phosphoproteomics lies in identifying the substrates of kinases, as currently only a small fraction of substrates identified can be confidently linked with a known kinase. Machine learning techniques are promising approaches for leveraging large-scale phosphoproteomics data to computationally predict substrates of kinases. However, the small number of experimentally validated kinase substrates (true positive) and the high data noise in many phosphoproteomics datasets together limit their applicability and utility. Here, we aim to develop advanced kinase-substrate prediction methods to address these challenges. Using a collection of seven large phosphoproteomics datasets, and both traditional and deep learning models, we first demonstrate that a 'pseudo-positive' learning strategy for alleviating small sample size is effective at improving model predictive performance. We next show that a data resampling-based ensemble learning strategy is useful for improving model stability while further enhancing prediction. Lastly, we introduce an ensemble deep learning model ('SnapKin') by incorporating the above two learning strategies into a 'snapshot' ensemble learning algorithm. We propose SnapKin, an ensemble deep learning method, for predicting substrates of kinases from large-scale phosphoproteomics data. We demonstrate that SnapKin consistently outperforms existing methods in kinase-substrate prediction. SnapKin is freely available at https://github.com/PYangLab/SnapKin.
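The 'snapshot' ensemble idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the SnapKin implementation: it assumes a cyclic cosine-annealed learning rate with a snapshot saved at the end of each cycle, and it swaps the deep network for a plain logistic regression on hypothetical toy data so that it runs with NumPy alone. All function names and parameter values here are illustrative assumptions.

```python
import numpy as np

def cosine_cyclic_lr(step, steps_per_cycle, lr_max):
    """Cosine-annealed learning rate that restarts at lr_max every cycle."""
    pos = step % steps_per_cycle
    return 0.5 * lr_max * (1.0 + np.cos(np.pi * pos / steps_per_cycle))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_snapshots(X, y, n_cycles=5, steps_per_cycle=200, lr_max=0.5):
    """Train one model and keep a copy of its weights at the end of each cycle."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    snapshots = []
    for step in range(n_cycles * steps_per_cycle):
        lr = cosine_cyclic_lr(step, steps_per_cycle, lr_max)
        p = sigmoid(X @ w + b)
        # Gradient of the mean logistic loss.
        w -= lr * (X.T @ (p - y) / n)
        b -= lr * np.mean(p - y)
        if (step + 1) % steps_per_cycle == 0:
            snapshots.append((w.copy(), b))
    return snapshots

def predict_ensemble(snapshots, X):
    """Average the predicted probabilities of all saved snapshots."""
    probs = [sigmoid(X @ w + b) for w, b in snapshots]
    return np.mean(probs, axis=0)

def make_toy_data(seed=0, n_per_class=100):
    """Hypothetical two-class data standing in for substrate features."""
    rng = np.random.default_rng(seed)
    pos = rng.normal(loc=1.5, scale=1.0, size=(n_per_class, 2))
    neg = rng.normal(loc=-1.5, scale=1.0, size=(n_per_class, 2))
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(n_per_class), np.zeros(n_per_class)])
    return X, y

if __name__ == "__main__":
    X, y = make_toy_data()
    snapshots = train_snapshots(X, y)
    scores = predict_ensemble(snapshots, X)
    print("snapshots:", len(snapshots), "mean score:", scores.mean())
```

Because logistic regression is convex, the snapshots in this toy differ very little. With a deep network, each learning-rate restart can move the model to a different local minimum, which is where averaging snapshots is expected to help.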

Akt phosphorylates insulin receptor substrate to limit PI3K-mediated PIP3 synthesis.

Kearney, Alison L; Norris, Dougall M; Ghomlaghi, Milad; Kin Lok Wong, Martin; Humphrey, Sean J; Carroll, Luke; Yang, Guang; Cooke, Kristen C; Yang, Pengyi; Geddes, Thomas A; Shin, Sungyoung; Fazakerley, Daniel J; Nguyen, Lan K; James, David E; Burchfield, James G.

Elife ; 10, 2021 07 13.

Article in English | MEDLINE | ID: mdl-34253290

ABSTRACT

The phosphoinositide 3-kinase (PI3K)-Akt network is tightly controlled by feedback mechanisms that regulate signal flow and ensure signal fidelity. A rapid overshoot in insulin-stimulated recruitment of Akt to the plasma membrane has previously been reported, which is indicative of negative feedback operating on acute timescales. Here, we show that Akt itself engages this negative feedback by phosphorylating insulin receptor substrate (IRS) 1 and 2 on a number of residues. Phosphorylation results in the depletion of plasma membrane-localised IRS1/2, reducing the pool available for interaction with the insulin receptor. Together these events limit plasma membrane-associated PI3K and phosphatidylinositol (3,4,5)-trisphosphate (PIP3) synthesis. We identified two Akt-dependent phosphorylation sites in IRS2 at S306 (S303 in mouse) and S577 (S573 in mouse) that are key drivers of this negative feedback. These findings establish a novel mechanism by which the kinase Akt acutely controls PIP3 abundance, through post-translational modification of the IRS scaffold.

For the body to work properly, cells must constantly 'talk' to each other using signalling molecules. Receiving a chemical signal triggers a series of molecular events in a cell, a so-called 'signal transduction pathway' that connects a signal with a precise outcome. Disturbing cell signalling can trigger disease, and strict control mechanisms are therefore in place to ensure that communication does not break down or become erratic. For instance, just as a thermostat turns off the heater once the right temperature is reached, negative feedback mechanisms in cells switch off signal transduction pathways when the desired outcome has been achieved. The hormone insulin is a signal for growth that increases in the body following a meal to promote the storage of excess blood glucose (sugar) in muscle and fat cells. The hormone binds to insulin receptors at the cell surface and switches on a signal transduction pathway that makes the cell take up glucose from the bloodstream. If the signal is not engaged, diseases such as diabetes develop. Conversely, if the signal cannot be adequately switched off, cancer can develop. Determining exactly how insulin works would help to understand these diseases better and to develop new treatments. Kearney et al. therefore set out to examine the biochemical 'fail-safes' that control insulin signalling. Experiments using computer simulations of the insulin signalling pathway revealed a potential new mechanism for negative feedback, which centred on a molecule known as Akt. The models predicted that if the negative feedback were removed, then Akt would become hyperactive and accumulate at the cell's surface after stimulation with insulin. Further manipulation of the 'virtual' insulin signalling pathway and studies of live cells in culture confirmed that this was indeed the case. The cell biology experiments also showed how Akt, once at the cell surface, was able to engage the negative feedback and shut down further insulin signalling. Akt did this by inactivating a protein required to pass the signal from the insulin receptor to the rest of the cell. Overall, this work helps to understand cell communication by revealing a previously unknown and critical component of the insulin signalling pathway.

Subjects

Phosphatidylinositol 3-Kinase/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Proto-Oncogene Proteins c-akt/metabolism , Insulin Receptor/metabolism , Animals , CD Antigens , Cell Membrane/metabolism , Computational Biology , Glucose/metabolism , Humans , Insulin/metabolism , Insulin Receptor Substrate Proteins/metabolism , Mechanistic Target of Rapamycin Complex 1 , Mice , Phosphorylation , Signal Transduction/physiology

CiteFuse enables multi-modal analysis of CITE-seq data.

Kim, Hani Jieun; Lin, Yingxin; Geddes, Thomas A; Yang, Jean Yee Hwa; Yang, Pengyi.

Bioinformatics ; 36(14): 4137-4143, 2020 08 15.

Article in English | MEDLINE | ID: mdl-32353146

ABSTRACT

MOTIVATION: Multi-modal profiling of single cells represents one of the latest technological advancements in molecular biology. Among various single-cell multi-modal strategies, cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) allows simultaneous quantification of two distinct species: RNA and cell-surface proteins. Here, we introduce CiteFuse, a streamlined package consisting of a suite of tools for doublet detection, modality integration, clustering, differential RNA and protein expression analysis, antibody-derived tag evaluation, ligand-receptor interaction analysis and interactive web-based visualization of CITE-seq data. RESULTS: We demonstrate the capacity of CiteFuse to integrate the two data modalities and its relative advantage against data generated from single-modality profiling using both simulations and real-world CITE-seq data. Furthermore, we illustrate a novel doublet detection method based on a combined index of cell hashing and transcriptome data. Finally, we demonstrate CiteFuse for predicting ligand-receptor interactions by using multi-modal CITE-seq data. Collectively, we demonstrate the utility and effectiveness of CiteFuse for the integrative analysis of transcriptome and epitope profiles from CITE-seq data. AVAILABILITY AND IMPLEMENTATION: CiteFuse is freely available at http://shiny.maths.usyd.edu.au/CiteFuse/ as an online web service and at https://github.com/SydneyBioX/CiteFuse/ as an R package. CONTACT: pengyi.yang@sydney.edu.au. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Software , Transcriptome , Epitopes , Gene Expression Profiling , RNA , RNA Sequence Analysis , Single-Cell Analysis

scReClassify: post hoc cell type classification of single-cell RNA-seq data.

Kim, Taiyun; Lo, Kitty; Geddes, Thomas A; Kim, Hani Jieun; Yang, Jean Yee Hwa; Yang, Pengyi.

BMC Genomics ; 20(Suppl 9): 913, 2019 Dec 24.

Article in English | MEDLINE | ID: mdl-31874628

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) is a fast-emerging technology allowing global transcriptome profiling at the single-cell level. Cell type identification from scRNA-seq data is a critical task in a variety of research areas such as developmental biology, cell reprogramming, and cancers. Typically, cell type identification relies on human inspection using a combination of prior biological knowledge (e.g. marker genes and morphology) and computational techniques (e.g. PCA and clustering). Due to the incompleteness of our current knowledge and the subjectivity involved in this process, a small number of cells may be subject to mislabelling. RESULTS: Here, we propose a semi-supervised learning framework, named scReClassify, for 'post hoc' cell type identification from scRNA-seq datasets. Starting from an initial cell type annotation with potentially mislabelled cells, scReClassify first performs dimension reduction using PCA and next applies a semi-supervised learning method to learn and subsequently reclassify cells that are likely mislabelled initially to the most probable cell types. By using both simulated and real-world experimental datasets that profiled various tissues and biological systems, we demonstrate that scReClassify is able to accurately identify and reclassify misclassified cells to their correct cell types. CONCLUSIONS: scReClassify can be used for scRNA-seq data as a post hoc cell type classification tool to fine-tune cell type annotations generated by any cell type classification procedure. It is implemented as an R package and is freely available from https://github.com/SydneyBioX/scReClassify.
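The two-step workflow described above (PCA, then relearning labels from a noisy annotation) can be sketched as follows. This is not the scReClassify code or its API: the sketch assumes a simple nearest-centroid classifier, iterated a few times, as a stand-in for the semi-supervised learner, and it uses hypothetical simulated data. All names and parameters are illustrative.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project cells (rows) onto the top principal components via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def reclassify(Z, labels, n_iter=5):
    """Iteratively reassign each cell to the nearest class centroid.

    Centroids are first estimated from the (possibly noisy) input labels,
    then re-estimated from the updated labels until nothing changes.
    """
    labels = labels.copy()
    classes = np.unique(labels)
    centroids = np.stack([Z[labels == c].mean(axis=0) for c in classes])
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = classes[np.argmin(dists, axis=1)]
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for j, c in enumerate(classes):
            if np.any(labels == c):  # keep the old centroid if a class empties
                centroids[j] = Z[labels == c].mean(axis=0)
    return labels

def make_toy_cells(seed=0, n_per_type=60, n_genes=50, mislabel_frac=0.1):
    """Hypothetical expression matrix with three cell types and flipped labels."""
    rng = np.random.default_rng(seed)
    n_types = 3
    means = rng.normal(scale=3.0, size=(n_types, n_genes))
    true = np.repeat(np.arange(n_types), n_per_type)
    X = means[true] + rng.normal(size=(true.size, n_genes))
    noisy = true.copy()
    n_flip = int(mislabel_frac * true.size)
    flip = rng.choice(true.size, size=n_flip, replace=False)
    noisy[flip] = (noisy[flip] + 1) % n_types  # move to a wrong type
    return X, true, noisy

if __name__ == "__main__":
    X, true, noisy = make_toy_cells()
    fixed = reclassify(pca_reduce(X, n_components=2), noisy)
    print("agreement before:", np.mean(noisy == true))
    print("agreement after: ", np.mean(fixed == true))
```

On well-separated simulated types a nearest-centroid rule is enough to recover the flipped labels. Real data with overlapping cell types would call for a stronger learner and for class-probability estimates rather than hard reassignment.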

Subjects

RNA-Seq/methods , Animals , Humans , Machine Learning , Mice , Single-Cell Analysis/methods , Software

Autoencoder-based cluster ensembles for single-cell RNA-seq data analysis.

Geddes, Thomas A; Kim, Taiyun; Nan, Lihao; Burchfield, James G; Yang, Jean Y H; Tao, Dacheng; Yang, Pengyi.

BMC Bioinformatics ; 20(Suppl 19): 660, 2019 Dec 24.

Article in English | MEDLINE | ID: mdl-31870278

ABSTRACT

BACKGROUND: Single-cell RNA-sequencing (scRNA-seq) is a transformative technology, allowing global transcriptomes of individual cells to be profiled with high accuracy. An essential task in scRNA-seq data analysis is the identification of cell types from complex samples or tissues profiled in an experiment. To this end, clustering has become a key computational technique for grouping cells based on their transcriptome profiles, enabling subsequent cell type identification from each cluster of cells. Due to the high feature-dimensionality of the transcriptome (i.e. the large number of measured genes in each cell) and because only a small fraction of genes are cell type-specific and therefore informative for generating cell type-specific clusters, clustering directly on the original feature/gene dimension may lead to uninformative clusters and hinder correct cell type identification. RESULTS: Here, we propose an autoencoder-based cluster ensemble framework in which we first take random subspace projections from the data, then compress each random projection to a low-dimensional space using an autoencoder artificial neural network, and finally apply ensemble clustering across all encoded datasets to generate clusters of cells. We employ four evaluation metrics to benchmark clustering performance and our experiments demonstrate that the proposed autoencoder-based cluster ensemble can lead to substantially improved cell type-specific clusters when applied with both the standard k-means clustering algorithm and a state-of-the-art kernel-based clustering algorithm (SIMLR) designed specifically for scRNA-seq data. Compared to directly using these clustering algorithms on the original datasets, the performance improvement in some cases is up to 100%, depending on the evaluation metric used. CONCLUSIONS: Our results suggest that the proposed framework can facilitate more accurate cell type identification as well as other downstream analyses. 
The code for creating the proposed autoencoder-based cluster ensemble framework is freely available from https://github.com/gedcom/scCCESS.
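The three stages named in the abstract (random subspace projection, compression to a low-dimensional space, ensemble clustering) can be sketched in a few lines. This is not the scCCESS code: to stay runnable with NumPy alone, the sketch assumes random gene subsets as the subspace projection, swaps the autoencoder for PCA as the compression step, and builds the consensus from a co-association matrix. Every name, parameter and the toy data are illustrative assumptions.

```python
import numpy as np

def kmeans(Z, k, rng, n_iter=50):
    """Plain Lloyd's k-means with farthest-point initialisation."""
    centers = [Z[rng.integers(len(Z))]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def compress(X, n_components):
    """PCA used here as a stand-in for the autoencoder's encoder."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def cluster_ensemble(X, k, n_projections=10, subspace_frac=0.5,
                     n_components=5, seed=0):
    """Cluster several compressed random subspaces and combine the results."""
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    n_sub = max(n_components, int(subspace_frac * n_genes))
    coassoc = np.zeros((n_cells, n_cells))
    for _ in range(n_projections):
        genes = rng.choice(n_genes, size=n_sub, replace=False)
        Z = compress(X[:, genes], n_components)
        labels = kmeans(Z, k, rng)
        # Count how often each pair of cells lands in the same cluster.
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= n_projections
    # Consensus step: cluster the rows of the co-association matrix.
    return kmeans(coassoc, k, rng)

def make_toy(seed=1, n_per=40, n_genes=100):
    """Hypothetical expression matrix with three well-separated cell types."""
    rng = np.random.default_rng(seed)
    true = np.repeat(np.arange(3), n_per)
    means = rng.normal(scale=3.0, size=(3, n_genes))
    X = means[true] + rng.normal(size=(true.size, n_genes))
    return X, true

if __name__ == "__main__":
    X, true = make_toy()
    consensus = cluster_ensemble(X, k=3)
    print("cluster sizes:", np.bincount(consensus))
```

Swapping `compress` for a trained autoencoder would bring the sketch closer to the framework described in the abstract. The consensus step could equally use hierarchical clustering on the co-association matrix; k-means on its rows is chosen here only for brevity.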

Subjects

RNA Sequence Analysis , Algorithms , Cluster Analysis , Data Analysis , Humans , Computer Neural Networks , RNA-Seq , Single-Cell Analysis , Transcriptome

Multiplexed Temporal Quantification of the Exercise-regulated Plasma Peptidome.

Parker, Benjamin L; Burchfield, James G; Clayton, Daniel; Geddes, Thomas A; Payne, Richard J; Kiens, Bente; Wojtaszewski, Jørgen F P; Richter, Erik A; James, David E.

Mol Cell Proteomics ; 16(12): 2055-2068, 2017 Dec.

Article in English | MEDLINE | ID: mdl-28982716

ABSTRACT

Exercise is extremely beneficial to whole-body health, reducing the risk of a number of chronic human diseases. Some of these physiological benefits appear to be mediated via the secretion of peptide/protein hormones into the bloodstream. The plasma peptidome contains the entire complement of low molecular weight endogenous peptides derived from secretion, protease activity and PTMs, and is a rich source of hormones. In the current study we have quantified the effects of intense exercise on the plasma peptidome to identify novel exercise-regulated secretory factors in humans. We developed an optimized 2D-LC-MS/MS method and used multiple fragmentation methods including HCD and EThcD to analyze endogenous peptides. This resulted in quantification of 5,548 unique peptides during a time course of exercise and recovery. The plasma peptidome underwent large, dynamic changes during exercise on a time-scale of minutes, with many changes rapidly reversing after exercise cessation. Among acutely regulated peptides, many were known hormones including insulin, glucagon, ghrelin, bradykinin, cholecystokinin and secretogranins, validating the method. Prediction of bioactive peptides regulated with exercise identified C-terminal peptides from Transgelins, which were increased in plasma during exercise. In vitro experiments using synthetic peptides identified a role for transgelin peptides in the regulation of cell-cycle, extracellular matrix remodeling and cell migration. We investigated the effects of exercise on the regulation of PTMs and proteolytic processing by building a site-specific network of protease/substrate activity. Collectively, our deep peptidomic analysis of plasma revealed that exercise rapidly modulates the circulation of hundreds of bioactive peptides through a network of proteases and PTMs. These findings illustrate that peptidomics is an ideal method for quantifying changes in circulating factors on a global scale in response to physiological perturbations such as exercise. This will likely be a key method for pinpointing exercise-regulated factors that generate health benefits.

Subjects

Exercise , Peptides/analysis , Proteome/chemistry , Proteomics/methods , Adult , Cell Line , Liquid Chromatography , Humans , Male , Microfilament Proteins/blood , Microfilament Proteins/chemistry , Muscle Proteins/blood , Muscle Proteins/chemistry , Peptides/blood , Post-Translational Protein Processing , Proteolysis , Tandem Mass Spectrometry
